SNA-IDS: Social Network Analysis-Augmented Malware Traffic Detection Using Graph-Theoretic Features and Isolation Forest

Authors: Susmitha Hemanth, Chetan M, Vedanth J, Rachana J, Nandan Gowda T M

DOI Link: https://doi.org/10.22214/ijraset.2026.81963

Abstract

This paper introduces SNA-IDS, an innovative intrusion detection framework that brings together Social Network Analysis (SNA) principles and unsupervised machine learning to identify malware-laden network traffic. Whereas most conventional anomaly-detection systems treat each network flow as an isolated observation, SNA-IDS instead represents the entire communication fabric as a continuously evolving graph—with individual hosts as nodes and packet flows as weighted, directed edges—and draws on a rich collection of graph-theoretic descriptors in addition to classical flow-level statistics. We operationalise key SNA concepts—PageRank, betweenness centrality, structural holes, tie strength, triadic closure, community partitioning, homophily, and cascade diffusion—to characterise the behavioural topology that distinguishes normal traffic from adversarial activity. The resulting graph features are concatenated with flow-level attributes and submitted to an Isolation Forest anomaly scorer. A subsequent two-stage reranking step refines the raw alert list by incorporating network-position evidence, and SHAP (Shapley Additive Explanations) delivers per-alert reasoning that security analysts can act on directly. Validation across publicly available malware benchmarks shows that adding SNA-derived features raises detection accuracy by six to nine percentage points compared with flow-only baselines, yielding a 97.8 % true-positive rate with a false-positive rate below 2.1 %. SNA-IDS reliably uncovers Remote Access Trojans (RATs), botnet infrastructures, DNS tunnelling channels, Command-and-Control (C2) networks, and coordinated cascade attacks by exploiting the structural signatures these threats imprint on the communication graph.

Introduction

The paper presents SNA-IDS (Social Network Analysis-based Intrusion Detection System), a novel malware detection framework that combines Social Network Analysis (SNA) with traditional network intrusion detection techniques. Unlike conventional systems that analyze network flows independently, SNA-IDS models network traffic as a directed communication graph, allowing it to capture relationships between hosts and identify complex attack patterns such as botnets, Advanced Persistent Threats (APTs), and Command-and-Control (C2) infrastructures.
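As a minimal sketch of the graph model described above, flow records can be folded into a directed NetworkX graph whose edges carry aggregate weights. The record layout (source, destination, byte count), the attribute names `flows` and `bytes`, and the addresses are assumptions made for illustration; the paper does not specify them here.

```python
import networkx as nx

# Hypothetical flow records: (source host, destination host, bytes transferred).
flows = [
    ("10.0.0.5", "10.0.0.9", 1200),
    ("10.0.0.5", "10.0.0.9", 800),
    ("10.0.0.7", "10.0.0.9", 300),
    ("10.0.0.9", "203.0.113.4", 5000),
]


def build_communication_graph(flow_records):
    """Aggregate flows into a directed graph with per-edge flow counts and byte totals."""
    graph = nx.DiGraph()
    for src, dst, n_bytes in flow_records:
        if graph.has_edge(src, dst):
            # Repeated flows between the same pair strengthen the existing edge.
            graph[src][dst]["flows"] += 1
            graph[src][dst]["bytes"] += n_bytes
        else:
            graph.add_edge(src, dst, flows=1, bytes=n_bytes)
    return graph


graph = build_communication_graph(flows)
```

Keeping both a flow count and a byte total on each edge leaves the choice of weight open for later metrics.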

The system extracts a 15-dimensional set of graph-based features including PageRank, betweenness centrality, structural holes, clustering coefficient, homophily, community structure, tie strength, cascade reach, and small-world characteristics. These relational features are fused with traditional flow-level attributes and analyzed using an Isolation Forest anomaly detection model. A two-stage graph-aware reranking mechanism further reduces false positives, while SHAP-based explanations provide transparent and interpretable security insights.
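The sketch below illustrates how a few of these graph features could be computed with NetworkX and concatenated with flow-level attributes. It covers only four of the fifteen dimensions (PageRank, betweenness, Burt's constraint as a structural-hole measure, and clustering coefficient). The toy topology, the host names, and the two flow attributes are hypothetical; the paper's actual feature definitions may differ.

```python
import networkx as nx

# Toy topology (hypothetical): four bots report to one hub, and three
# ordinary hosts talk to each other in a closed triangle.
graph = nx.DiGraph()
graph.add_edges_from(("bot%d" % i, "hub") for i in range(1, 5))
graph.add_edges_from([("n1", "n2"), ("n2", "n3"), ("n3", "n1")])

# Hypothetical flow-level attributes per host: [mean packet size, flows per minute].
flow_features = {host: [512.0, 3.0] for host in graph.nodes}


def sna_features(g):
    """Return {host: [pagerank, betweenness, constraint, clustering]}."""
    undirected = g.to_undirected()
    pagerank = nx.pagerank(g)
    betweenness = nx.betweenness_centrality(g)
    # Burt's constraint: low values mean the host bridges otherwise
    # unconnected neighbours, i.e. it spans structural holes.
    constraint = nx.constraint(undirected)
    # Clustering coefficient: high values mean the host's neighbours
    # are themselves connected (closed triads).
    clustering = nx.clustering(undirected)
    return {
        h: [pagerank[h], betweenness[h], constraint[h], clustering[h]]
        for h in g.nodes
    }


graph_features = sna_features(graph)

# Fusion by simple concatenation: flow attributes first, graph features after.
fused = {h: flow_features[h] + graph_features[h] for h in graph.nodes}
```

In this toy graph the hub has a lower constraint and a lower clustering coefficient than any host in the triangle, which is the kind of structural contrast the feature set is meant to expose.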

The methodology involves constructing a real-time communication graph from network traffic, computing SNA metrics, detecting anomalous hosts through Isolation Forest, and refining alerts using graph context. Concepts from social network theory—such as triadic closure, brokerage, diffusion models, community detection, and weak-tie analysis—are mapped directly to cybersecurity scenarios to identify malicious behaviors that conventional flow-based methods often miss.
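The detection and refinement steps can be sketched as follows, assuming scikit-learn's `IsolationForest`. The reranking rule used here, a weighted blend of the rescaled anomaly score and a per-host graph-evidence value, is an illustrative assumption and not the paper's stated formula. The synthetic feature matrix, `graph_evidence`, `top_k`, and `alpha` are likewise hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic fused feature matrix (hypothetical): 200 ordinary hosts, 5 odd ones.
normal_hosts = rng.normal(0.0, 1.0, size=(200, 6))
odd_hosts = rng.normal(6.0, 1.0, size=(5, 6))
features = np.vstack([normal_hosts, odd_hosts])

model = IsolationForest(n_estimators=100, random_state=0).fit(features)
# score_samples is higher for normal points, so negate it to get an anomaly score.
anomaly_score = -model.score_samples(features)

# Stage 1: shortlist the hosts with the highest anomaly scores.
top_k = 20
shortlist = np.argsort(anomaly_score)[::-1][:top_k]

# Hypothetical graph evidence in [0, 1], e.g. derived from structural-hole position.
graph_evidence = rng.uniform(0.0, 0.2, size=len(features))
graph_evidence[200:] = 0.9


def rerank(candidates, scores, evidence, alpha=0.7):
    """Stage 2: reorder candidates by a blend of anomaly score and graph evidence."""
    s = scores[candidates]
    span = s.max() - s.min()
    # Min-max scale within the shortlist so both terms share the [0, 1] range.
    scaled = (s - s.min()) / span if span > 0 else np.zeros_like(s)
    combined = alpha * scaled + (1.0 - alpha) * evidence[candidates]
    return candidates[np.argsort(combined)[::-1]]


alerts = rerank(shortlist, anomaly_score, graph_evidence)
```

Restricting the second stage to a shortlist means the graph evidence only has to be computed for hosts the forest already considers suspicious.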

The architecture consists of modules for packet capture, flow aggregation, graph construction, SNA feature extraction, anomaly detection, reranking, explainability, and alert generation. Implemented in Python using libraries such as NetworkX, scikit-learn, SHAP, and Louvain community detection, the system supports near real-time operation and scales to enterprise and IoT environments.
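For the community-detection module, one possible sketch uses the Louvain implementation that ships with NetworkX (version 2.8 or later). Whether the authors used this implementation or a separate Louvain package is not stated, so treat the call below as an assumption. The intra-community edge fraction is offered only as a simple stand-in for a homophily-style feature.

```python
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import louvain_communities

# Toy topology (hypothetical): two fully connected groups joined by one bridge edge.
group_a = ["a1", "a2", "a3", "a4"]
group_b = ["b1", "b2", "b3", "b4"]
graph = nx.Graph()
graph.add_edges_from(combinations(group_a, 2))
graph.add_edges_from(combinations(group_b, 2))
graph.add_edge("a1", "b1")

# Louvain greedily maximises modularity; the seed fixes the node visiting order.
communities = louvain_communities(graph, seed=0)

# Map each host to the index of its community.
membership = {host: idx for idx, group in enumerate(communities) for host in group}

# Share of edges whose endpoints sit in the same community.
intra_edges = sum(1 for u, v in graph.edges if membership[u] == membership[v])
intra_fraction = intra_edges / graph.number_of_edges()
```

A host whose edges mostly cross community boundaries would pull this fraction down, which is one way a bridging node could be flagged for closer inspection.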

Experimental evaluation on benchmark datasets including CICIDS2017, CTU-13, DNS Tunneling, and TON_IoT demonstrates that SNA-IDS improves detection accuracy by 6–9 percentage points over traditional flow-based approaches while simultaneously reducing false positives. The results confirm that incorporating graph-based social network features provides richer contextual information, enabling more accurate, interpretable, and scalable malware detection.
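To make the reported quantities concrete, the sketch below computes the usual confusion-matrix rates and shows how a gain in percentage points differs from a relative gain in percent. All counts and scores in it are hypothetical placeholders chosen for easy arithmetic; they are not results from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix rates for a binary detector."""
    recall = tp / (tp + fn)        # true-positive rate
    fpr = fp / (fp + tn)           # false-positive rate
    precision = tp / (tp + fp)
    return {
        "tpr": recall,
        "fpr": fpr,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def gain_in_points(baseline, improved):
    """Absolute difference between two rates, in percentage points."""
    return (improved - baseline) * 100.0


def relative_gain_percent(baseline, improved):
    """Difference expressed as a percentage of the baseline value."""
    return (improved - baseline) / baseline * 100.0


# Hypothetical counts: 100 malicious and 900 benign hosts.
metrics = detection_metrics(tp=90, fp=5, tn=895, fn=10)

# Hypothetical scores: the same change reads differently under each convention.
points = gain_in_points(0.85, 0.92)
relative = relative_gain_percent(0.85, 0.92)
```

With these placeholder values the change from 0.85 to 0.92 is 7 percentage points but roughly 8.2 percent in relative terms, so the two conventions should not be used interchangeably.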

Conclusion

This paper has presented SNA-IDS, the first comprehensive integration of Social Network Analysis theory with unsupervised malware traffic detection. By treating network communication as a dynamic social graph and extracting 15 graph-theoretic features—spanning centrality measures, structural holes, tie strength, triadic closure, homophily, community membership, cascade diffusion, and small-world properties—we demonstrate that adversarial network behaviour imprints distinctive and measurable structural signatures. Combining SNA features with traditional flow-level attributes within an Isolation Forest, refined by graph-aware two-stage reranking and explained through SNA-enriched SHAP attributions, the system achieves 97.8 % accuracy and a 2.1 % false-positive rate—surpassing flow-only baselines by 6.5 percentage points in F1 score while remaining entirely unsupervised. The central insight of SNA-IDS is that attackers are social actors embedded in a network topology, and their behaviour is simultaneously shaped by and revealed through their structural position within the communication graph. C2 servers occupy structural holes. Botnets form homophilous communities with small-world propagation dynamics. Malware advances through cascades that obey threshold dynamics. These are not loose analogies—they are quantifiable, measurable phenomena that SNA-IDS operationalises into a production-ready security system. In doing so, this work opens a compelling research agenda at the intersection of network science and cybersecurity.
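The threshold-cascade idea mentioned above can be illustrated with a small fractional-threshold model: a host becomes active once the active share of the hosts that can influence it reaches a threshold. This is a generic textbook-style sketch, not the paper's cascade feature; the topology, the host names, and the threshold values are hypothetical.

```python
def threshold_cascade(influencers, seeds, threshold):
    """Return the set of hosts activated from `seeds`.

    influencers maps each host to the set of hosts that can influence it.
    A host activates when the active fraction of its influencers >= threshold.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for host, sources in influencers.items():
            if host in active or not sources:
                continue
            active_fraction = len(sources & active) / len(sources)
            if active_fraction >= threshold:
                active.add(host)
                changed = True
    return active


# Hypothetical topology: each host lists the hosts that can influence it.
influencers = {
    "a": set(),
    "b": {"a"},
    "c": {"a", "b"},
    "d": {"c", "e"},
    "e": {"f"},
    "f": set(),
}

reach_low = threshold_cascade(influencers, seeds={"a"}, threshold=0.5)
reach_high = threshold_cascade(influencers, seeds={"a"}, threshold=0.6)

# Cascade reach: number of hosts eventually activated, including the seed.
cascade_reach = len(reach_low)
```

Raising the threshold from 0.5 to 0.6 stops the cascade before host `d`, which shows how sensitive cascade reach is to the assumed activation rule.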

Copyright

Copyright © 2026 Susmitha Hemanth, Chetan M, Vedanth J, Rachana J, Nandan Gowda T M. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET81963

Publish Date : 2026-05-04

ISSN : 2321-9653

Publisher Name : IJRASET